NSF PAR Search | NSF Public Access Repository


Isotype-aware inference of B cell clonal lineage trees from single-cell sequencing data

https://doi.org/10.1016/j.xgen.2024.100637

Weber, Leah L; Reiman, Derek; Roddur, Mrinmoy S; Qi, Yuanyuan; El-Kebir, Mohammed; Khan, Aly A (September 2024, Cell Genomics)

Full Text Available
DERNA Enables Pareto Optimal RNA Design

https://doi.org/10.1089/cmb.2023.0283

Gu, Xinyu; Qi, Yuanyuan; El-Kebir, Mohammed (March 2024, Journal of Computational Biology)

Full Text Available
Consensus Tree Under the Ancestor–Descendant Distance is NP-Hard

https://doi.org/10.1089/cmb.2023.0262

Qi, Yuanyuan; El-Kebir, Mohammed (January 2024, Journal of Computational Biology)

Full Text Available
Sapling: Inferring and Summarizing Tumor Phylogenies from Bulk Data Using Backbone Trees

https://doi.org/10.4230/LIPIcs.WABI.2024.7

Qi, Yuanyuan; El-Kebir, Mohammed (January 2024, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Pissis, Solon P; Sung, Wing-Kin (Ed.)
Cancer phylogenies are key to understanding tumor evolution. There exist many important downstream analyses that take as input a single tree or a small number of trees. However, due to uncertainty, one typically infers many equally plausible phylogenies from bulk DNA sequencing data of tumors. We introduce Sapling, a heuristic method to solve the Backbone Tree Inference from Reads problem, which seeks a small set of backbone trees on a smaller subset of mutations that collectively summarize the entire solution space. Sapling also includes a greedy algorithm to solve the Backbone Tree Expansion from Reads problem, which aims to expand an inferred backbone tree into a full tree. We prove that both problems are NP-hard. On simulated and real data, we demonstrate that Sapling is capable of inferring high-quality backbone trees that adequately summarize the solution space and that can be expanded into full trees.
Full Text Available
Leibniz International Proceedings in Informatics (LIPIcs): 23rd International Workshop on Algorithms in Bioinformatics (WABI 2023)

https://doi.org/10.4230/LIPIcs.WABI.2023.21

Gu, Xinyu; Qi, Yuanyuan; El-Kebir, Mohammed (August 2023, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Belazzougui, Djamal; Ouangraoua, Aïda (Ed.)
The problem of designing an RNA sequence v that encodes a given target protein w plays an important role in messenger RNA (mRNA) vaccine design. Due to codon degeneracy, there exist exponentially many RNA sequences for a single target protein. These candidate RNA sequences may adopt different secondary structure conformations with varying minimum free energy (MFE), affecting their thermodynamic stability and consequently mRNA half-life. In addition, species-specific codon usage bias, as measured by the codon adaptation index (CAI), also plays an essential role in translation efficiency. While previous works have focused on optimizing either MFE or CAI, more recent works have shown the merits of optimizing both objectives. Importantly, there is a trade-off between MFE and CAI, i.e., optimizing one objective is at the expense of the other. Here, we formulate the Pareto Optimal RNA Design problem, seeking the set of Pareto optimal solutions for which no other solution exists that is better in terms of both MFE and CAI. We introduce DERNA (DEsign RNA), which uses the weighted sum method to enumerate the Pareto front by optimizing convex combinations of both objectives. DERNA uses dynamic programming to solve each convex combination in O(|w|³) time and O(|w|²) space. Compared to a previous approach that only optimizes MFE, we show on a benchmark dataset that DERNA obtains solutions with identical MFE but superior CAI. Additionally, we show that DERNA matches the performance in terms of solution quality of LinearDesign, a recent approach that similarly seeks to balance MFE and CAI. Finally, we demonstrate our method’s potential for mRNA vaccine design using SARS-CoV-2 spike as the target protein.
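To make the Pareto optimality and weighted sum ideas in this abstract concrete, here is a minimal Python sketch on a small, explicitly listed set of candidates. It is not DERNA's dynamic program: the candidate names and their (MFE, CAI) values are made-up toy numbers, the functions `dominates`, `pareto_front`, and `weighted_sum_sweep` are hypothetical helpers, and the scalarized objective `lam * MFE - (1 - lam) * CAI` is an assumed form of the convex combination. The dominance test uses the standard textbook definition (at least as good in both objectives, strictly better in one).

```python
from typing import List, Tuple

# (name, MFE, CAI): lower MFE is better, higher CAI is better.
# Toy values chosen for illustration only, not real energies.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b in both objectives and strictly better in one."""
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (brute force)."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def weighted_sum_sweep(cands: List[Candidate], steps: int = 11) -> List[Candidate]:
    """For each weight lam on a grid over [0, 1], pick the candidate minimizing
    lam * MFE - (1 - lam) * CAI (an assumed scalarization), and collect the
    distinct winners in the order they are first found."""
    found: List[Candidate] = []
    for i in range(steps):
        lam = i / (steps - 1)
        best = min(cands, key=lambda c: lam * c[1] - (1 - lam) * c[2])
        if best not in found:
            found.append(best)
    return found


candidates: List[Candidate] = [
    ("A", -3.0, 0.60),
    ("B", -2.8, 0.85),
    ("C", -2.0, 0.95),
    ("D", -2.5, 0.70),  # dominated by B
    ("E", -2.9, 0.55),  # dominated by A
]

front_names = [c[0] for c in pareto_front(candidates)]
sweep_names = sorted(c[0] for c in weighted_sum_sweep(candidates))
print(front_names, sweep_names)
```

On this toy input both approaches return candidates A, B, and C. The brute-force filter needs the full candidate list, which is exponentially large in the real problem; according to the abstract, DERNA instead solves each convex combination directly by dynamic programming.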
Implications of non-uniqueness in phylogenetic deconvolution of bulk DNA samples of tumors

https://doi.org/10.1186/s13015-019-0155-6

Qi, Yuanyuan; Pradhan, Dikshant; El-Kebir, Mohammed (December 2019, Algorithms for Molecular Biology)

Full Text Available
Summarizing the solution space in tumor phylogeny inference by multiple consensus trees

https://doi.org/10.1093/bioinformatics/btz312

Aguse, Nuraini; Qi, Yuanyuan; El-Kebir, Mohammed (July 2019, Bioinformatics)

Motivation: Cancer phylogenies are key to studying tumorigenesis and have clinical implications. Due to the heterogeneous nature of cancer and limitations in current sequencing technology, current cancer phylogeny inference methods identify a large solution space of plausible phylogenies. To facilitate further downstream analyses, methods that accurately summarize such a set T of cancer phylogenies are imperative. However, current summary methods are limited to a single consensus tree or graph and may miss important topological features that are present in different subsets of candidate trees. Results: We introduce the Multiple Consensus Tree (MCT) problem to simultaneously cluster T and infer a consensus tree for each cluster. We show that MCT is NP-hard, and present an exact algorithm based on mixed integer linear programming (MILP). In addition, we introduce a heuristic algorithm that efficiently identifies high-quality consensus trees, recovering all optimal solutions identified by the MILP in simulated data at a fraction of the time. We demonstrate the applicability of our methods on both simulated and real data, showing that our approach selects the number of clusters depending on the complexity of the solution space T. Availability and implementation: https://github.com/elkebir-group/MCT. Supplementary information: Supplementary data are available at Bioinformatics online.